Results 1 - 19 of 19
1.
Article in English | MEDLINE | ID: mdl-38557614

ABSTRACT

As post-transcriptional regulators of gene expression, micro-ribonucleic acids (miRNAs) are regarded as potential biomarkers for a variety of diseases. Hence, the prediction of miRNA-disease associations (MDAs) is of great significance for an in-depth understanding of disease pathogenesis and progression. Existing prediction models mainly concentrate on incorporating different sources of biological information to perform the MDA prediction task, while failing to consider the full potential utility of MDA network information at the motif level. To overcome this problem, we propose a novel motif-aware MDA prediction model, namely MotifMDA, which fuses a variety of high- and low-order structural information. In particular, we first design several motifs of interest, considering their ability to characterize how miRNAs are associated with diseases through different network structural patterns. Then, MotifMDA adopts a two-layer hierarchical attention mechanism to identify novel MDAs. Specifically, the first attention layer learns high-order motif preferences based on their occurrences in the given MDA network, while the second one learns the final embeddings of miRNAs and diseases by coupling high- and low-order preferences. Experimental results on two benchmark datasets demonstrate the superior performance of MotifMDA over several state-of-the-art prediction models. This strongly indicates that accurate MDA prediction can be achieved by relying solely on MDA network information. Furthermore, our case studies indicate that the incorporation of motif-level structural information allows MotifMDA to discover novel MDAs from different perspectives. The data and code are available at https://github.com/stevejobws/MotifMDA.
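
The abstract does not list the motifs MotifMDA uses, so the following is only a sketch of what counting motif occurrences in an MDA network might look like. It assumes one hypothetical motif, the three-hop path miRNA - disease - miRNA - disease, and counts how often it links a candidate pair in a toy bipartite network. The function name, the motif choice and the identifiers are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def count_three_hop_paths(associations, mirna, disease):
    """Count paths mirna - d' - m' - disease in a bipartite MDA network.

    `associations` is an iterable of (miRNA, disease) pairs. The three-hop
    path is a hypothetical motif chosen for illustration only.
    """
    diseases_of = defaultdict(set)  # miRNA -> associated diseases
    mirnas_of = defaultdict(set)    # disease -> associated miRNAs
    for m, d in associations:
        diseases_of[m].add(d)
        mirnas_of[d].add(m)

    count = 0
    for d_mid in diseases_of[mirna]:
        if d_mid == disease:
            continue  # the path must pass through a different disease
        for m_mid in mirnas_of[d_mid]:
            if m_mid == mirna:
                continue  # and through a different miRNA
            if disease in diseases_of[m_mid]:
                count += 1
    return count


# Toy network with made-up identifiers.
toy_mdas = [
    ("miR-a", "D1"), ("miR-a", "D2"),
    ("miR-b", "D1"), ("miR-b", "D3"),
    ("miR-c", "D2"), ("miR-c", "D3"),
]
paths = count_three_hop_paths(toy_mdas, "miR-a", "D3")
print(paths)
```

A count like this could serve as one motif-occurrence feature for a candidate pair; how MotifMDA weights such occurrences with attention is not specified in the abstract.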

2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38426324

ABSTRACT

Emerging clinical evidence suggests that sophisticated associations between circular ribonucleic acids (circRNAs) and microRNAs (miRNAs) are a critical regulatory factor in various pathological processes and play a critical role in most intricate human diseases. Nonetheless, identifying these associations via wet-lab experiments is error-prone and labor-intensive, and numerous existing computational methods for predicting novel circRNA-miRNA associations (CMAs) rely on only a single type of correlation data. Considering the inadequacy of existing machine learning models, we propose a new model named BGF-CMAP, which combines the gradient boosting decision tree with natural language processing and graph embedding methods to infer associations between circRNAs and miRNAs. Specifically, BGF-CMAP extracts sequence attribute features with Word2vec and interaction behavior features with two homogeneous graph embedding algorithms, large-scale information network embedding and graph factorization. Comprehensive experimental analysis showed that BGF-CMAP predicted the complex relationship between circRNAs and miRNAs with an accuracy of 82.90% and an area under the receiver operating characteristic curve of 0.9075. Furthermore, 23 of the top 30 predicted miRNA-associated circRNAs were confirmed by relevant experiments, showing that the BGF-CMAP model is superior to others. BGF-CMAP can serve as a helpful model that provides a scientific theoretical basis for the study of CMA prediction.


Subjects
MicroRNAs , Humans , MicroRNAs/genetics , RNA, Circular/genetics , ROC Curve , Machine Learning , Algorithms , Computational Biology/methods
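
The BGF-CMAP abstract above says sequence features are extracted with Word2vec but does not give the tokenization. A common approach, assumed here, is to split each sequence into overlapping k-mers that play the role of words. The function name `sequence_to_kmers`, the choice k = 3 and the toy sequences are illustrative assumptions.

```python
def sequence_to_kmers(sequence, k=3):
    """Split a nucleotide sequence into overlapping k-mer "words"."""
    if k <= 0:
        raise ValueError("k must be positive")
    sequence = sequence.upper()
    # A sequence shorter than k yields no words at all.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


# Each sequence becomes one "sentence"; a list of such sentences is the
# kind of corpus a word-embedding model is usually trained on.
toy_sequences = ["AUGGCUA", "ggcau"]  # made-up sequences
corpus = [sequence_to_kmers(s, k=3) for s in toy_sequences]
print(corpus)
```

The resulting list of token lists could then be passed to any word-embedding trainer; which library and hyperparameters the authors used is not stated in the abstract.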
3.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38324624

ABSTRACT

Connections between circular RNAs (circRNAs) and microRNAs (miRNAs) play a pivotal role in the onset, evolution, diagnosis and treatment of diseases and tumors. Selecting the most promising circRNA-related miRNAs and using them as biological markers or drug targets could help in dealing with complex human diseases through preventive strategies, diagnostic procedures and therapeutic approaches. Compared to traditional biological experiments, leveraging computational models to integrate diverse biological data in order to infer potential associations proves to be a more efficient and cost-effective approach. This paper develops a Convolutional Autoencoder model for CircRNA-MiRNA Association prediction (CA-CMA). First, the model merges the natural language characteristics of the circRNA and miRNA sequences with the features of circRNA-miRNA interactions. Next, it uses all circRNA-miRNA pairs to construct a molecular association network, which is then fine-tuned on labeled samples to optimize the network parameters. Finally, the prediction outcome is obtained with a deep neural network classifier. The model combines a likelihood objective that preserves neighborhoods, used to learn continuous feature representations of words, with the preservation of the spatial information of two-dimensional signals. Under 5-fold cross-validation, CA-CMA outperformed numerous prior computational approaches, with a mean area under the receiver operating characteristic curve of 0.9138 and a small SD of 0.0024. Furthermore, recent literature has confirmed 25 of the top 30 circRNA-miRNA pairs with the highest CA-CMA scores in the case studies. The results of these experiments highlight the robustness and versatility of our model.


Subjects
MicroRNAs , Neoplasms , Humans , MicroRNAs/genetics , RNA, Circular/genetics , Likelihood Functions , Neural Networks, Computer , Neoplasms/genetics , Computational Biology/methods
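
The CA-CMA abstract above reports a mean and SD over 5-fold cross-validation. As a minimal sketch of that bookkeeping, the code below splits sample indices into five folds and summarizes per-fold scores. The helper names are hypothetical, the scores are placeholders rather than results from the paper, and the use of the sample (n - 1) standard deviation is an assumption, since the abstract does not say which variant was used.

```python
import random
import statistics


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def summarize(fold_scores):
    """Return the mean and sample standard deviation of per-fold scores."""
    return statistics.mean(fold_scores), statistics.stdev(fold_scores)


folds = k_fold_indices(n_samples=23, k=5)
# Placeholder per-fold AUC values, NOT results from the paper.
toy_aucs = [0.90, 0.92, 0.91, 0.93, 0.89]
mean_auc, sd_auc = summarize(toy_aucs)
print(f"{mean_auc:.4f} +/- {sd_auc:.4f}")
```

In each round, one fold would be held out for testing and the other four used for training, with the score of the held-out fold appended to the list that is summarized at the end.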
4.
Methods ; 220: 106-114, 2023 12.
Article in English | MEDLINE | ID: mdl-37972913

ABSTRACT

Discovering new indications for existing drugs is a promising development strategy at various stages of drug research and development. However, most computational methods complete this task by constructing a variety of heterogeneous networks without considering the higher-order connectivity patterns available in heterogeneous biological information networks, which are believed to be useful for improving the accuracy of drug discovery. To this end, we propose a computational model, called SFRLDDA, for drug-disease association prediction using semantic graph and function similarity representation learning. Specifically, SFRLDDA first constructs a heterogeneous information network (HIN) by integrating drug-disease, drug-protein and protein-disease associations with their biological knowledge. Second, different representation learning strategies are applied to obtain the feature representations of drugs and diseases from different perspectives, over the constructed semantic graph and function similarity graphs, respectively. Finally, SFRLDDA incorporates a Random Forest classifier to discover potential drug-disease associations (DDAs). Experimental results demonstrate that SFRLDDA yields the best performance when compared with other state-of-the-art models on three benchmark datasets. Moreover, case studies also indicate that the simultaneous consideration of the semantic graph and the function similarity of drugs and diseases in the HIN allows SFRLDDA to precisely predict DDAs in a more comprehensive manner.


Subjects
Algorithms , Semantics , Information Services
5.
BMC Bioinformatics ; 24(1): 451, 2023 Nov 29.
Article in English | MEDLINE | ID: mdl-38030973

ABSTRACT

BACKGROUND: As an important task in bioinformatics, clustering analysis plays a critical role in understanding the functional mechanisms of many complex biological systems, which can be modeled as biological networks. The purpose of clustering analysis in biological networks is to identify functional modules of interest, but there is a lack of online clustering tools that visualize biological networks and provide in-depth biological analysis for discovered clusters. RESULTS: Here we present BioCAIV, a novel webserver designed to maximize accessibility and applicability for the clustering analysis of biological networks. This, together with its user-friendly interface, helps biological researchers perform an accurate clustering analysis of biological networks and identify functionally significant modules for further assessment. CONCLUSIONS: BioCAIV is an efficient clustering analysis webserver designed for a variety of biological networks. BioCAIV is freely available without registration requirements at http://bioinformatics.tianshanzw.cn:8888/BioCAIV/ .


Subjects
Computational Biology , Software , Cluster Analysis
7.
Bioinformatics ; 39(8)2023 08 01.
Article in English | MEDLINE | ID: mdl-37505483

ABSTRACT

MOTIVATION: The task of predicting drug-target interactions (DTIs) plays a significant role in facilitating novel drug discovery. Compared with laboratory-based approaches, computational methods proposed for DTI prediction are preferred due to their high efficiency and low cost. Recently, much attention has been paid to applying different graph neural network (GNN) models to discover underlying DTIs from a heterogeneous biological information network (HBIN). Although GNN-based prediction methods achieve better performance, they are prone to the over-smoothing problem when learning the latent representations of drugs and targets from their rich neighborhood information in the HBIN, which reduces their discriminative ability in DTI prediction. RESULTS: In this work, an improved graph representation learning method, namely iGRLDTI, is proposed to address the above issue by capturing more discriminative representations of drugs and targets in a latent feature space. Specifically, iGRLDTI first constructs an HBIN by integrating the biological knowledge of drugs and targets with their interactions. After that, it adopts a node-dependent local smoothing strategy to adaptively decide the propagation depth of each biomolecule in the HBIN, thus significantly alleviating over-smoothing by enhancing the discriminative ability of the feature representations of drugs and targets. Finally, iGRLDTI uses a Gradient Boosting Decision Tree classifier to predict novel DTIs. Experimental results demonstrate that iGRLDTI yields better performance than several state-of-the-art computational methods on the benchmark dataset. Besides, our case study indicates that iGRLDTI can successfully identify novel DTIs with more distinguishable features of drugs and targets. AVAILABILITY AND IMPLEMENTATION: Python code and the dataset are available at https://github.com/stevejobws/iGRLDTI/.


Subjects
Drug Discovery , Neural Networks, Computer , Computer Simulation , Drug Discovery/methods , Drug Interactions
8.
Org Lett ; 25(25): 4700-4704, 2023 06 30.
Article in English | MEDLINE | ID: mdl-37314939

ABSTRACT

Severe side effects and drug resistance are major drawbacks of Pt-based chemotherapy in clinical practice, leading to the search for new Pt-based drugs through the tuning of coordination ligands. Therefore, seeking appropriate ligands has attracted significant interest in this area. In this study, we report a Ni-catalyzed coupling strategy for the divergent synthesis of diphenic acid derivatives and the application of these newly prepared acids in Pt(II) agent synthesis.


Subjects
Biphenyl Compounds , Ligands , Catalysis
9.
Mol Ther Nucleic Acids ; 32: 721-728, 2023 Jun 13.
Article in English | MEDLINE | ID: mdl-37251691

ABSTRACT

Identifying proteins that interact with drug compounds has been recognized as an important part of the drug discovery process. Despite extensive efforts invested in predicting compound-protein interactions (CPIs), existing traditional methods still face several challenges. Computer-aided methods can identify high-quality CPI candidates almost instantaneously. In this research, a novel model named GraphCPIs is proposed to improve CPI prediction accuracy. First, we establish the adjacency matrix of entities connected to both drugs and proteins from the collected dataset. Then, the feature representations of nodes are obtained using a graph convolutional network and the GraRep embedding model. Finally, an extreme gradient boosting (XGBoost) classifier is used to identify potential CPIs based on the two kinds of features stacked together. The results demonstrate that GraphCPIs achieves the best performance, with an average predictive accuracy of 90.09%, an average area under the receiver operating characteristic curve of 0.9572, and an average area under the precision-recall curve of 0.9621. Moreover, comparative experiments reveal that our method surpasses state-of-the-art approaches in terms of accuracy and other indicators under the same experimental environment. We believe that the GraphCPIs model will provide valuable insight for discovering novel candidate drug-related proteins.
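
Many abstracts in this list, including the one above, report the area under the receiver operating characteristic curve. As a sketch of what that number measures, the code below computes it directly from its rank interpretation: the probability that a randomly chosen positive pair receives a higher score than a randomly chosen negative pair. The function name and the toy labels and scores are illustrative assumptions, not data from any of the papers.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
value = auroc(toy_labels, toy_scores)
print(round(value, 4))
```

The pairwise loop is quadratic and only meant to make the definition explicit; sorting-based implementations give the same value far more efficiently on large datasets.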

10.
BMC Bioinformatics ; 23(1): 516, 2022 Dec 01.
Article in English | MEDLINE | ID: mdl-36456957

ABSTRACT

BACKGROUND: Drug repositioning is a very important task that provides critical information for exploring the potential efficacy of drugs. Yet developing computational models that can effectively predict drug-disease associations (DDAs) is still a challenging task. Previous studies suggest that the accuracy of DDA prediction can be improved by integrating different types of biological features. But how to conduct an effective integration remains a challenging problem for accurately discovering new indications for approved drugs. METHODS: In this paper, we propose a novel meta-path based graph representation learning model, namely RLFDDA, to predict potential DDAs on heterogeneous biological networks. RLFDDA first calculates drug-drug similarities and disease-disease similarities as the intrinsic biological features of drugs and diseases. A heterogeneous network is then constructed by integrating DDAs, disease-protein associations and drug-protein associations. With such a network, RLFDDA adopts a meta-path random walk model to learn the latent representations of drugs and diseases, which are concatenated to construct joint representations of drug-disease associations. As the last step, we employ a random forest classifier to predict potential DDAs from their joint representations. RESULTS: To demonstrate the effectiveness of RLFDDA, we have conducted a series of experiments on two benchmark datasets following a ten-fold cross-validation scheme. The results show that RLFDDA yields the best performance in terms of AUC and F1-score when compared with several state-of-the-art DDA prediction models. We have also conducted a case study on a common drug and a common disease, i.e., paclitaxel and lung tumors, and found that 7 of the top-10 diseases and 8 of the top-10 drugs have already been validated for paclitaxel and lung tumors, respectively, with literature evidence. Hence, the promising performance of RLFDDA may provide a new perspective for novel DDA discovery over heterogeneous networks.


Subjects
Learning , Lung Neoplasms , Humans , Benchmarking , Drug Discovery , Paclitaxel
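
The RLFDDA abstract above mentions a meta-path random walk over a drug-protein-disease network without naming the meta-paths. The sketch below shows the general idea under stated assumptions: a hypothetical cyclic meta-path drug - protein - disease - protein - drug, uniform sampling among neighbors of the required type, and a made-up toy network. None of the names come from the paper.

```python
import random


def build_neighbors(edges):
    """Build an undirected adjacency list from (node, node) pairs."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    return neighbors


def meta_path_walk(neighbors, node_type, start, meta_path, walk_length, rng):
    """Random walk whose node types follow a cyclic meta-path."""
    if meta_path[0] != meta_path[-1]:
        raise ValueError("meta-path must start and end with the same type")
    if node_type[start] != meta_path[0]:
        raise ValueError("start node does not match the meta-path")
    cycle = meta_path[:-1]  # drop the repeated closing type
    walk = [start]
    position = 0
    while len(walk) < walk_length:
        position = (position + 1) % len(cycle)
        wanted = cycle[position]
        candidates = [n for n in neighbors[walk[-1]] if node_type[n] == wanted]
        if not candidates:
            break  # dead end: stop the walk early
        walk.append(rng.choice(candidates))
    return walk


# Toy heterogeneous network with made-up identifiers.
toy_edges = [("drugX", "protA"), ("drugY", "protA"), ("drugY", "protB"),
             ("protA", "disP"), ("protB", "disP")]
toy_types = {"drugX": "drug", "drugY": "drug", "protA": "protein",
             "protB": "protein", "disP": "disease"}
toy_neighbors = build_neighbors(toy_edges)
toy_meta_path = ("drug", "protein", "disease", "protein", "drug")
walk = meta_path_walk(toy_neighbors, toy_types, "drugX", toy_meta_path,
                      walk_length=9, rng=random.Random(0))
print(walk)
```

Walks of this kind are typically collected from every start node and fed to a skip-gram style embedding model; whether RLFDDA does exactly that is not stated in the abstract.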
11.
Brief Bioinform ; 23(6)2022 11 19.
Article in English | MEDLINE | ID: mdl-36125202

ABSTRACT

Drug repositioning (DR) is a promising strategy to discover new indications for approved drugs with artificial intelligence techniques, thus improving traditional drug discovery and development. However, most DR computational methods fall short of taking into account the non-Euclidean nature of biomedical network data. To overcome this problem, a deep learning framework, namely DDAGDL, is proposed to predict drug-disease associations (DDAs) by using geometric deep learning (GDL) over a heterogeneous information network (HIN). By incorporating complex biological information into the topological structure of the HIN, DDAGDL effectively learns smoothed representations of drugs and diseases with an attention mechanism. Experimental results demonstrate the superior performance of DDAGDL on three real-world datasets under 10-fold cross-validation when compared with state-of-the-art DR methods in terms of several evaluation metrics. Our case studies and molecular docking experiments indicate that DDAGDL is a promising DR tool that gains new insights into exploiting geometric prior knowledge for improved efficacy.


Subjects
Deep Learning , Drug Repositioning , Drug Repositioning/methods , Artificial Intelligence , Molecular Docking Simulation , Information Services , Algorithms , Computational Biology/methods
12.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36088547

ABSTRACT

A growing body of clinical evidence shows that circular ribonucleic acids (circRNAs) perform a very important function in complex diseases by participating in the transcriptional and translational regulation of microRNA (miRNA) target genes. However, discovering associations between circRNAs and miRNAs with traditional biological experiments, which depend on demanding high-throughput techniques and strict experimental conditions, is labor-intensive, expensive, time-consuming and inefficient. In this paper, we propose a novel computational model, called WSCD, that predicts potential circRNA-miRNA associations based on Word2vec, Structural Deep Network Embedding (SDNE), a Convolutional Neural Network and a Deep Neural Network. Specifically, the WSCD model extracts attribute features and behavior features with word embedding and graph embedding algorithms, respectively, and ultimately feeds them into a feature fusion model, constructed by combining the Convolutional Neural Network and the Deep Neural Network, to deduce potential circRNA-miRNA interactions. The proposed method was evaluated on a dataset and obtained a prediction accuracy of 81.61% and an area under the receiver operating characteristic curve of 0.8898, which is reported to be much higher than those of state-of-the-art models and classifier models. In addition, 23 of the top 30 miRNA-related circRNAs were confirmed by relevant experiments. Overall, the results indicate that WSCD would be a helpful and reliable supplementary method to wet-laboratory experiments for predicting potential miRNA-circRNA associations.


Subjects
MicroRNAs , RNA, Circular , Algorithms , MicroRNAs/genetics , Neural Networks, Computer , ROC Curve
13.
IEEE J Biomed Health Inform ; 26(10): 5075-5084, 2022 10.
Article in English | MEDLINE | ID: mdl-35976848

ABSTRACT

Increasing evidence suggests that circRNA, as one of the most promising emerging biomarkers, has a very close relationship with diseases. Exploring the relationship between circRNAs and diseases can provide a novel perspective on disease diagnosis and pathogenesis. Existing circRNA-disease association (CDA) prediction models, however, generally treat all data attributes equally, pay no special attention to the attributes with more significant influence, and do not make full use of the correlation and symbiosis between attributes to mine the latent semantic information of the data. In response to these problems, this paper proposes a natural semantic enhancement method, NSECDA, to predict CDAs. In practical terms, we first treat the circRNA sequence as a biological language and analyze its natural semantic properties using natural language understanding theory; we then integrate it with disease attributes and the circRNA and disease Gaussian Interaction Profile (GIP) kernel attributes, and use a Graph Attention Network (GAT) to focus on the influential attributes and mine deeply hidden features; finally, a Rotation Forest (RoF) classifier is used to determine CDAs. On the gold standard dataset CircR2Disease, NSECDA achieved 92.49% accuracy with an AUC score of 0.9225. In comparison with the non-natural semantic enhancement model and other classifier models, NSECDA also shows competitive performance. Additionally, 25 of the CDA pairs with unknown associations among the top 30 prediction scores of NSECDA have been confirmed by newly reported studies. These achievements suggest that NSECDA is an effective model for predicting CDAs, which can provide credible candidates for subsequent wet-lab experiments and thus significantly reduce the scope of investigations.


Subjects
RNA, Circular , Semantics , Algorithms , Computational Biology/methods , Humans , RNA, Circular/genetics
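
The NSECDA abstract above uses Gaussian Interaction Profile (GIP) kernel attributes without defining them. As a sketch, the code below follows the commonly used definition, K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), with gamma set to a bandwidth parameter divided by the mean squared norm of the interaction profiles. The function name, the default bandwidth of 1 and the toy matrix are assumptions; the abstract does not state the paper's exact settings.

```python
import math


def gip_kernel(interaction_rows, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel between rows of a 0/1 matrix.

    Each row is the interaction profile of one entity (for example, one
    circRNA against all diseases).
    """
    n = len(interaction_rows)
    mean_sq_norm = sum(sum(v * v for v in row) for row in interaction_rows) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2
                          for a, b in zip(interaction_rows[i],
                                          interaction_rows[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix: 3 entities (rows) by 3 partners (columns).
toy_profiles = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(toy_profiles)
print([[round(v, 4) for v in row] for row in K])
```

Applying the same function to the transposed matrix gives the kernel for the other entity type, so one association matrix yields both a circRNA similarity and a disease similarity.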
14.
Org Lett ; 24(23): 4155-4159, 2022 06 17.
Article in English | MEDLINE | ID: mdl-35658460

ABSTRACT

The utilization of readily available starting materials to produce useful molecules is often challenged by selectivity issues. In this study, a Ni-catalyzed protecting-group-free C-C coupling protocol is described for the efficient synthesis of 2,2'-biphenol derivatives. Its remarkable chemoselectivity control ability, wide substrate scope, and excellent functional group tolerance highlight this newly developed strategy. Detailed mechanistic studies have demonstrated that potassium tert-butoxide acts as a critical agent to prevent the occurrence of protonation events.


Subjects
Catalysis , Phenols
15.
BMC Bioinformatics ; 23(1): 234, 2022 Jun 16.
Article in English | MEDLINE | ID: mdl-35710342

ABSTRACT

BACKGROUND: Protein-protein interaction (PPI) plays an important role in regulating cells and signals. Despite ongoing bioassay efforts, incomplete data still limit our ability to understand the molecular roots of human disease. Therefore, it is urgent to develop a computational method to predict PPIs from the perspective of the molecular system. METHODS: In this paper, a highly efficient computational model, MTV-PPI, is proposed for PPI prediction based on a heterogeneous molecular network by learning inter-view protein sequences and intra-view interactions between molecules simultaneously. On the one hand, the inter-view feature is extracted from the protein sequence by the k-mer method. On the other hand, we use a popular embedding method, LINE, to encode the heterogeneous molecular network and obtain the intra-view feature. Thus, the protein representation used in MTV-PPI is constructed by aggregating its inter-view and intra-view features. Finally, a random forest is used to predict potential PPIs. RESULTS: To prove the effectiveness of MTV-PPI, we conduct extensive experiments on a collected heterogeneous molecular network, achieving an accuracy of 86.55%, a sensitivity of 82.49%, a precision of 89.79%, an AUC of 0.9301 and an AUPR of 0.9308. Further comparison experiments are performed with various protein representations and classifiers to indicate the effectiveness of MTV-PPI in predicting PPIs based on a complex network. CONCLUSION: The experimental results illustrate that MTV-PPI is a promising tool for PPI prediction, which may provide a new perspective for future research on interaction prediction based on heterogeneous molecular networks.


Subjects
Protein Interaction Mapping , Proteins , Amino Acid Sequence , Computational Biology/methods , Humans , Protein Interaction Mapping/methods , Proteins/metabolism
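
The MTV-PPI abstract above reports accuracy, sensitivity and precision. As a reminder of how these three metrics relate to the confusion matrix, the sketch below computes them from raw counts. The function name and the counts are illustrative placeholders, not values from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, sensitivity (recall) and precision from counts."""
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions over all
        "sensitivity": tp / (tp + fn),   # found positives over true positives
        "precision": tp / (tp + fp),     # true positives over predicted ones
    }


# Placeholder counts for a balanced toy test set of 200 protein pairs.
toy_metrics = classification_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in toy_metrics.items():
    print(f"{name}: {value:.4f}")
```

A precision higher than the sensitivity, as in the abstract, means the classifier makes relatively few false-positive calls but misses a larger share of the true interactions.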
16.
Brief Bioinform ; 23(3)2022 05 13.
Article in English | MEDLINE | ID: mdl-35323894

ABSTRACT

While the technologies of ribonucleic acid sequencing (RNA-seq) and transcript assembly analysis have continued to improve, a novel topology of RNA transcript, called circular RNA (circRNA), was uncovered in the last decade. Recently, researchers have revealed that circRNAs compete with messenger RNA (mRNA) and long noncoding RNA for binding to microRNA in gene regulation. Therefore, circRNA is assumed to be associated with complex diseases, and discovering the relationships between them would contribute to medical research. However, identifying associations between circRNAs and diseases in vitro takes a long time and usually proceeds without clear direction. In recent years, more and more associations have been verified by experiments. Hence, we propose a computational method, named identifying circRNA-disease association based on graph representation learning (iGRLCDA), for predicting potential circRNA-disease associations, which utilizes a deep learning model combining a graph convolution network (GCN) and graph factorization (GF). In detail, iGRLCDA first derives the hidden features of known circRNA-disease associations using the Gaussian interaction profile (GIP) kernel combined with disease semantic information to form a numeric descriptor. After that, it uses the deep learning model of GCN and GF to extract hidden features from the descriptor. Finally, a random forest classifier is introduced to identify potential circRNA-disease associations. Under five-fold cross-validation on the gold standard data, iGRLCDA is strongly competitive with other excellent prediction models, achieving an average area under the receiver operating characteristic curve of 0.9289 and an area under the precision-recall curve of 0.9377. On checking the prediction results against the relevant literature, 22 of the top 30 predicted circRNA-disease associations were found in recently published papers. These results lead us to believe that iGRLCDA can provide reliable circRNA-disease associations for medical research and reduce the blindness of wet-lab experiments.


Subjects
MicroRNAs , RNA, Circular , Algorithms , Computational Biology/methods , MicroRNAs/genetics , ROC Curve
17.
Brief Bioinform ; 23(1)2022 01 17.
Article in English | MEDLINE | ID: mdl-34891172

ABSTRACT

Identifying new indications for drugs plays an essential role at many phases of drug research and development. Computational methods are regarded as an effective way to associate drugs with new indications. However, most of them complete their tasks by constructing a variety of heterogeneous networks without considering the biological knowledge of drugs and diseases, which is believed to be useful for improving the accuracy of drug repositioning. To this end, a novel heterogeneous information network (HIN) based model, namely HINGRL, is proposed to precisely identify new indications for drugs based on graph representation learning techniques. More specifically, HINGRL first constructs a HIN by integrating drug-disease, drug-protein and protein-disease biological networks with the biological knowledge of drugs and diseases. Then, different representation strategies are applied to learn the features of nodes in the HIN from the topological and biological perspectives. Finally, HINGRL adopts a Random Forest classifier to predict unknown drug-disease associations based on the integrated features of drugs and diseases obtained in the previous step. Experimental results demonstrate that HINGRL achieves the best performance on two real datasets when compared with state-of-the-art models. Besides, our case studies indicate that the simultaneous consideration of network topology and the biological knowledge of drugs and diseases allows HINGRL to precisely predict drug-disease associations from a more comprehensive perspective. The promising performance of HINGRL also reveals that the utilization of rich heterogeneous information provides an alternative view for HINGRL to identify novel drug-disease associations, especially for new diseases.


Subjects
Information Services , Machine Learning , Pharmaceutical Preparations , Algorithms , Computational Biology/methods , Disease , Drug Repositioning/methods , Humans , Models, Theoretical , Neural Networks, Computer
18.
Front Genet ; 12: 657182, 2021.
Article in English | MEDLINE | ID: mdl-34054920

ABSTRACT

Drug repositioning mines existing drugs to find new targets, which quickly uncovers new drug-disease associations and reduces the risk of drug discovery in traditional medicine and biology. Therefore, it is of great significance to design a computational model with high efficiency and accuracy. In this paper, we propose a novel computational method, MGRL, to predict drug-disease associations based on multi-graph representation learning. More specifically, MGRL first uses a graph convolution network to learn the graph representations of drugs and diseases from their self-attributes. Then, a graph embedding algorithm is used to represent the relationships between drugs and diseases. Finally, the two kinds of graph representation learning features are fed into a random forest classifier for training. To the best of our knowledge, this is the first work to construct a multi-graph to extract the characteristics of drugs and diseases for predicting drug-disease associations. The experiments show that MGRL achieves an AUC of 0.8506 under five-fold cross-validation, which is significantly better than other existing methods. Case study results show the reliability of the proposed method, which is of great significance for practical applications.

19.
Cancers (Basel) ; 13(9)2021 Apr 27.
Article in English | MEDLINE | ID: mdl-33925568

ABSTRACT

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates almost instantly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI can capture both the local and the global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. Moreover, we compare the presented method with some existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is also expected to bring new inspiration and provide novel perspectives to relevant researchers.
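
The abstract above says that first-order neighbor information is aggregated by a GCN. As a sketch of what one such aggregation step does, the code below applies the widely used symmetric normalization D^-1/2 (A + I) D^-1/2 X to a toy graph, leaving out the trainable weight matrix and the nonlinearity. The function name and the toy data are assumptions, and this is not the LGDTI implementation.

```python
import math


def gcn_propagate(adjacency, features):
    """One weight-free GCN propagation step: D^-1/2 (A + I) D^-1/2 X."""
    n = len(adjacency)
    # Add self-loops so each node keeps part of its own features.
    a_hat = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    dim = len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if a_hat[i][j] == 0:
                continue
            weight = a_hat[i][j] / math.sqrt(degree[i] * degree[j])
            for d in range(dim):
                out[i][d] += weight * features[j][d]
    return out


# Toy path graph 0 - 1 - 2 with two-dimensional node features.
toy_adjacency = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
smoothed = gcn_propagate(toy_adjacency, toy_features)
print([[round(v, 4) for v in row] for row in smoothed])
```

Each output row mixes a node's own features with those of its direct neighbors only, which is why a single layer captures first-order structure and a walk-based method such as DeepWalk is paired with it for higher-order structure.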
